Search 



ID No 9 

AC015701 
LOCUS       AC015701   174149 bp    DNA             HTG       07-APR-2000
DEFINITION  Homo sapiens chromosome 11 clone RP11-210K21 map 11, WORKING DRAFT
            SEQUENCE, 24 unordered pieces.
ACCESSION   AC015701
VERSION     AC015701.3  GI:7523736
KEYWORDS    HTG; HTGS_PHASE1; HTGS_DRAFT.
SOURCE      human.
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 174149)
  AUTHORS   Birren,B., Linton,L., Nusbaum,C. and Lander,E.
  TITLE     Homo sapiens chromosome 11, clone RP11-210K21
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 174149)
  AUTHORS   Birren,B., Linton,L., Nusbaum,C., Lander,E., Allen,N., Anderson,M.,
            Baldwin,J., Barna,N., Beckerly,R., Boguslavkiy,L., Boukhgalter,B.,
            Brown,A., Castle,A., Colangelo,M., Collins,S., Collymore,A.,
            Cooke,P., DeArellano,K., Dewar,K., Domino,M., Donelan,L., Doyle,M.,
            Ferreira,P., FitzHugh,W., Forrest,C., Funke,R., Gage,D.,
            Galagan,J., Gardyna,S., Grant,G., Hagos,B., Heaford,A., Horton,L.,
            Howland,J.C., Johnson,R., Jones,C., Kann,L., Karatas,A., Klein,J.,
            Lehoczky,J., Lieu,C., Locke,K., Macdonald,P., Marquis,N.,
            McEwan,P., McGurk,A., McKernan,K., McLaughlin,J., Meldrim,J.,
            Morrow,J., Naylor,J., Norman,C.H., O'Connor,T., O'Donnell,P.,
            Peterson,K., Pollara,V., Riley,R., Roy,A., Santos,R., Severy,P.,
            Stange-Thomann,N., Stojanovic,N., Subramanian,A., Talamas,J.,
            Tesfaye,S., Tirrell,A., Vassiliev,H., Vo,A., Wheeler,J., Wu,X.,
            Wyman,D., Ye,W.J., Zimmer,A. and Zody,M.
  TITLE     Direct Submission
  JOURNAL   Submitted (17-NOV-1999) Whitehead Institute/MIT Center for Genome
            Research, 320 Charles Street, Cambridge, MA 02141, USA
COMMENT     On Apr 7, 2000 this sequence version replaced gi:6479169.
            All repeats were identified using RepeatMasker:
            Smit, A.F.A. & Green, P. (1996-1997)
            http://ftp.genome.washington.edu/RM/RepeatMasker.html
            Genome Center
              Center: Whitehead Institute/ MIT Center for Genome Research
              Center code: WIBR
              Web site: http://www-seq.wi.mit.edu
              Contact: sequence_submissions@genome.wi.mit.edu
            Project Information
              Center project name: L1373
              Center clone name: 210_K_21
            Summary Statistics
              Sequencing vector: M13; M77815; 100% of reads
              Chemistry: Dye-terminator Big Dye; 100% of reads
              Assembly program: Phrap; version 0.960731
              Consensus quality: 144096 bases at least Q40
              Consensus quality: 161320 bases at least Q30
              Consensus quality: 168048 bases at least Q20
              Insert size: 187000; agarose-fp
              Insert size: 171849; sum-of-contigs
              Quality coverage: 3.5 in Q20 bases; agarose-fp
              Quality coverage: 3.8 in Q20 bases; sum-of-contigs
            * NOTE: This is a 'working draft' sequence. It currently
            consists of 24 contigs. The true order of the pieces
            is not known and their order in this sequence record is
            arbitrary. Gaps between the contigs are represented as
            runs of N, but the exact sizes of the gaps are unknown.
            This record will be updated with the finished sequence
            as soon as it is available and the accession number will
            be preserved.
              1827: contig of 1827 bp in length
              1927: gap of 100 bp
              3978: contig of 2051 bp in length
              4078: gap of 100 bp
              6795: contig of 2717 bp in length
              6895: gap of 100 bp
              9683: contig of 2788 bp in length
              9783: gap of 100 bp
             11884: contig of 2101 bp in length
             11984: gap of 100 bp
             16473: contig of 4489 bp in length
             16573: gap of 100 bp
             20142: contig of 3569 bp in length
             20242: gap of 100 bp
             24886: contig of 4644 bp in length
             24986: gap of 100 bp
             29158: contig of 4172 bp in length
             29258: gap of 100 bp
             33784: contig of 4526 bp in length
             33884: gap of 100 bp
             38874: contig of 4990 bp in length
             38974: gap of 100 bp
             44050: contig of 5076 bp in length
             44150: gap of 100 bp
             49356: contig of 5206 bp in length
             49456: gap of 100 bp
             54065: contig of 4609 bp in length
             54165: gap of 100 bp
             60407: contig of 6242 bp in length
             60507: gap of 100 bp
             68554: contig of 8047 bp in length
             68654: gap of 100 bp
             77303: contig of 8649 bp in length
             77403: gap of 100 bp
             88452: contig of 11049 bp in length
             88552: gap of 100 bp
             97198: contig of 8646 bp in length
             97298: gap of 100 bp
            111202: contig of 13904 bp in length
            111302: gap of 100 bp
            122947: contig of 11645 bp in length
            123047: gap of 100 bp
            137344: contig of 14297 bp in length
            137444: gap of 100 bp
            150883: contig of 13439 bp in length
            150983: gap of 100 bp
            174149: contig of 23166 bp in length.
FEATURES             Location/Qualifiers
     source          1..174149
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /chromosome="11"
                     /map="11"
                     /clone="RP11-210K21"
                     /clone_lib="RPCI-11 Human Male BAC"
ORIGIN
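
The contig and gap coordinates in the AC015701 COMMENT block can be checked arithmetically. The following is a minimal Python sketch, assuming each listed number is the end coordinate of a contig and that every gap is a run of exactly 100 N; the names `CONTIG_ENDS`, `GAP` and `contig_lengths` are illustrative and not part of the record.

```python
# End coordinates of the 24 contigs, copied from the COMMENT block of AC015701.
CONTIG_ENDS = [
    1827, 3978, 6795, 9683, 11884, 16473, 20142, 24886,
    29158, 33784, 38874, 44050, 49356, 54065, 60407, 68554,
    77303, 88452, 97198, 111202, 122947, 137344, 150883, 174149,
]
GAP = 100  # assumed: every gap is written as 100 N


def contig_lengths(ends, gap=GAP):
    """Derive each contig length from its end coordinate.

    The first contig starts at base 1; every later contig starts
    one base after the gap that follows the previous contig.
    """
    lengths = []
    start = 1
    for end in ends:
        lengths.append(end - start + 1)
        start = end + gap + 1
    return lengths


lengths = contig_lengths(CONTIG_ENDS)
total_contig_bases = sum(lengths)
record_length = total_contig_bases + GAP * (len(lengths) - 1)

print(len(lengths), total_contig_bases, record_length)
```

Under these assumptions the contig lengths sum to the "sum-of-contigs" insert size given in the Summary Statistics, and adding the 23 gaps gives the record length on the LOCUS line.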



Query Match 49.0%; Score 83.8; DB 73; Length 174149; 

Best Local Similarity 70.8%; Pred. No. 4.5e-15;



Matches 114; Conservative 0; Mismatches 46; Indels 1; Gaps



Qy 4 cacanangannggncntgtgaggacacagcnagaagcaagtctntgcatgncnagaagaa 63 

I M I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I 

Db 5282 CACAGAAGAGAGGCCATGGGAGGACACAGAGAGAAGGTGGTGTCTACAAGCCGAGGAGAG 5341 

Qy 64 cggcctcaacagacaccanncctgccagcaccttgatcttggcttntggcctccagaact 123 

III II I I I I II I I I I I I I I I I I I M I II I I I I I I I I I I I I I I I M I 
Db 5342 AGGCTGCACCAGAAACCAACCCTGCCAGCAGTTTGATCTTGGACTTCAGCCTCCAGAACT 5401

Qy 124 gtgaaagantaaagattctgttgtttaagccagtacaaaat 164 

II I I I I I I I I I I I I II I II I M I I II II 

Db 5402 GTGAGTAAATAAA-CTTCTGTTGTTTAGGTCCCCCCAGCAT 5441
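
The "Best Local Similarity" figures in this listing are consistent with the number of matches divided by the number of alignment columns (matches plus mismatches plus indels). That relationship is inferred from the numbers shown here and is not a documented definition of the search program; the function name below is hypothetical. A short Python sketch:

```python
def best_local_similarity(matches, mismatches, indels):
    """Percentage of alignment columns that are exact matches.

    Assumption: every alignment column is either a match, a mismatch,
    or an indel position, so the three counts sum to the alignment length.
    """
    columns = matches + mismatches + indels
    return round(100.0 * matches / columns, 1)


# AC015701 hit above: 114 matches, 46 mismatches, 1 indel over 161 columns.
print(best_local_similarity(114, 46, 1))
```

The same calculation can be applied to the other Qy/Db summaries in this listing as a quick sanity check on OCR-damaged counts.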



SEQ ID NO: 9 

Q29033/C 

ID Q29033 standard; DNA; 279 BP. 

AC Q29033; 

DT 23-FEB-1993 (first entry) 

DE Low frequency repeat probe LF18. 

KW AluI restriction digest; genetic mapping; human chromosome;

KW polymerase chain reaction; XL1/pCDLF18; ATCC 68558; ss.

OS Homo sapiens. 

PN EP-505605-A. 

PD 30-SEP-1992. 

PF 11-APR-1991; 105802.

PR 28-MAR-1991; US-676292. 

PA (UYWA-) UNIV WAYNE STATE. 

PI Duncan CH, Kaplan DJ, Solus JF; 

DR WPI; 92-324992/40. 

PT New nucleic acid probes - have a labelled low frequency 

PT repetitive sequence for detecting overlaps among cloned DNA 

PS Claim 18; Page 34; 41pp; English. 

CC Probe LF18 (ATCC 68558) is taken from an unidentified locus in the
CC human genome according to unpublished data from the
CC applicants' laboratory. The probe was obtained by PCR amplification
CC of human placental DNA (see Q29014 and Q29015). LF18 can be used for
CC detecting overlaps among cloned DNA molecules in genetic mapping.
CC See Q29012-Q29017 and Q29021-Q29038.

SQ Sequence 279 BP; 59 A; 56 C; 65 G; 99 T; 

Query Match 32.4%; Score 55.4; DB 1; Length 279; 

Best Local Similarity 67.3%; Pred. No. 5e-10; 

Matches 99; Conservative 0; Mismatches 43; Indels 5; Gaps

innggncntgtgaggacacagcnagaagcaagtctntgcatgncnagaagaacggcctc 70 
II I I I I I I I I I II I I I I II I I I I I I I I III I I I I 



I II II I II I I I I I I II II II I II I I II I I I I I I I I I I I I I II 
SEQ ID NO 15

AQ534984/C 

LOCUS       AQ534984     478 bp    DNA             GSS       18-MAY-1999
DEFINITION  RPCI-11-380E18.TJ RPCI-11 Homo sapiens genomic clone
            RPCI-11-380E18, genomic survey sequence.
ACCESSION   AQ534984
VERSION     AQ534984.1  GI:4846674
KEYWORDS    GSS.
SOURCE      human.
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 478)
  AUTHORS   Zhao,S., Adams,M.D., Nierman,W., Malek,J., de Jong,P. and
            Venter,J.C.
  TITLE     Use of BAC End Sequences from Library RPCI-11 for Sequence-Ready
            Map Building
  JOURNAL   Unpublished (1997)
COMMENT     On Dec 15, 1999 this sequence version replaced gi:4214464.
            Other_GSSs: RPCI-11-380E18.TV
            Contact: Shaying Zhao, William Nierman, Mark Adams
            Department of Eukaryotic Genomics
            The Institute for Genomic Research
            9712 Medical Center Dr., Rockville, MD 20850
            Tel: 301 838 0200
            Fax: 301 838 0208
            Email: hbe@tigr.org
            Clones are derived from the human BAC library RPCI-11. For BAC
            library availability, please contact Pieter de Jong
            (pieter@dejong.med.buffalo.edu). Clones may be purchased from
            BACPAC Resources (http://bacpac.med.buffalo.edu/ordering) or from
            Research Genetics (info@resgen.com). BAC end search page:
            http://www.tigr.org/tdb/humgen/bac_end_search/bac_end_search.html.
            Seq primer: SP6
            Class: BAC ends.
FEATURES             Location/Qualifiers
     source          1..478
                     /organism="Homo sapiens"
                     /db_xref="GDB:7645649"
                     /db_xref="taxon:9606"
                     /clone="RPCI-11-380E18"
                     /clone_lib="RPCI-11"
                     /sex="Male"
                     /cell_type="Lymphocytes"
                     /note="Vector: pBACe3.6; Site_1: EcoRI; Site_2: EcoRI;
                     RPCI11 Human Male BAC Library"
BASE COUNT      149 a     68 c    130 g    131 t
ORIGIN



Query Match 99.0%; Score 164.4; DB 105; Length 478;
Best Local Similarity 99.4%; Pred. No. 4.6e-42;
Matches 165; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy   1 tcagtatcctgacctggcaaggtgttccttaacctcccctctggatcccccttagcacac 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 460 TCAGTATCCTGACCTGGCAAGGTGTTCCTTAACCTCCCCTCTGGATCCCCCTTAGCACAC 401

Qy  61 atctgggacaatggagcgttcagcaccacggacagcattacaccctcttcaagtgcttgt 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 400 ATCTGGGACAATGGAGCGTTCAGCACCACGGACAGCATTACACCCTCTTCAAGTGCTTGT 341

Qy 121 taaggccatttgtctatttcactctcaagtaaataaaaatattttt 166
       ||| ||||||||||||||||||||||||||||||||||||||||||
Db 340 TAAAGCCATTTGTCTATTTCACTCTCAAGTAAATAAAAATATTTTT 295
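
In the AQ534984 alignment the Db coordinates run downward (460 to 295) while the Qy coordinates run upward, and the hit identifier carries a "/C" suffix. This suggests the query matched the complementary strand of the database entry; that reading is an assumption, not something the listing states. A minimal Python sketch of the coordinate mapping for an ungapped complement-strand alignment, with hypothetical helper names:

```python
# Translation table for complementing DNA, including the ambiguity code N.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def db_position(query_pos, query_start, db_start):
    """Map a query position to a database coordinate.

    Assumes an ungapped alignment reported on the complement strand,
    so database coordinates decrease as query coordinates increase.
    """
    return db_start - (query_pos - query_start)


# The alignment above starts at Qy 1 / Db 460.
print(db_position(60, 1, 460))    # end of the first alignment row
print(db_position(166, 1, 460))   # end of the alignment
print(reverse_complement("TAAAGCC"))
```

The mapping only holds while the alignment has no indels; a gapped alignment would need the offsets adjusted at each gap.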



SEQ ID NO: 18 

T03943/C 

ID T03943 standard; DNA; 4823 BP. 
AC T03943; 



DT 29-APR-1996 (first entry) 

DE Human thrombopoietin genomic coding sequence. 

KW Thrombopoietin; erythropoiesis stimulator; 

KW haematopoietic polypeptide; treatment; thrombocytopenia; anaemia; 

KW ds. 

OS Homo sapiens.

PN WO9521626-A1.

PD 17-AUG-1995. 

PF 09-FEB-1995; U01829. 

PR 14-FEB-1994; US-196025. 

PR 25-FEB-1994; US-203197. 

PR 21-MAR-1994; US-215203. 

PR 01-JUN-1994; US-252491.

PR 09-AUG-1994; US-288417. 

PR 07-NOV-1994; US-335566. 

PR 01-DEC-1994; US-347748.

PA (UNIW) UNIV WASHINGTON.

PI Kaushansky K; 

DR WPI; 95-292944/38. 

DR P-PSDB; R82684. 

PT Stimulation of erythropoiesis using thrombopoietin and opt. 

PT erythropoietin - for the treatment of thrombocytopenia and anaemia. 

PS Disclosure; Page 47-52; 66pp; English. 

CC This sequence corresponds to a single allele of the human
CC thrombopoietin gene. Thrombopoietin stimulates erythropoiesis to
CC produce an increase in proliferation or differentiation of
CC erythroid cells or to increase reticulocyte counts at least 2-fold
CC over baseline reticulocyte counts and, optionally, platelet levels
CC to at least 20000/cu mm. The protein can be used in a composition,
CC optionally with erythropoietin, for use in the treatment of
CC thrombocytopenia and anaemia, such as that caused by destruction of
CC haematopoietic cells in bone marrow, in the treatment of cancer
CC with chemotherapy and radiation, and in pathological conditions
CC such as myelodysplasia, AIDS, aplastic anaemia, autoimmune disease
CC or inflammatory disease.

SQ Sequence 4823 BP; 1205 A; 1368 C; 1048 G; 1202 T; 



Query Match 60.0%; Score 61.8; DB 1; Length 4823; 

Best Local Similarity 80.0%; Pred. No. 5.2e-12; 

Matches 72; Conservative 0; Mismatches 18; Indels 0; Gaps 

Qy 7 tccaagctactcagaagactgaagcagaaggatcacttgaggccaggagttcaagatcag 66 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 2917 TCCCAGCACTTTGGGAGGCTGAGGCAGGTGGATCACCTGAGGTCAGGAGTTCAAGATCAG 2858

Qy 67 cctgagcaacatagngaaaccctatctcta 96 

I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 2857 CCTGCCCAACATGGTGAAACCCCATCTCTA 2828 



Query= SEQ ID NO : 9 

(171 letters) 



Score E 

Sequences producing significant alignments: (bits) Value 

AC012640 ACCESSION:AC012640 NID: gi 27356677 gb AC012640.12 Ho... 149 2e-33
AC034241 ACCESSION:AC034241 NID: gi 17975241 gb AC034241.4 Hom... 149 2e-33

>AC012640 ACCESSION:AC012640 NID: gi 27356677 gb AC012640.12 Homo

sapiens chromosome 5 clone CTD-2256P15, complete sequence 
Length = 145122 

Score = 149 bits (75) , Expect = 2e-33 
Identities = 124/137 (90%), Gaps = 2/137 (1%) 
Strand = Plus / Plus 

Query: 20    tgtgaggacacagcnagaagcaagtctntgcatgncnagaagaacggcctcaacagacac 79
             |||||||||||||| |||||||||| | |||| | | ||||||| |||||||||||||||
Sbjct: 54846 tgtgaggacacagcgagaagcaagtatctgcaagtcaagaagaaaggcctcaacagacac 54905

Query: 80    canncctgccagcaccttgatcttgg-cttntggcctccagaactgtgaaagantaaaga 138
             ||  |||||||||||||||||||||| ||| |||||||||||||||||||||| |||| |
Sbjct: 54906 cagccctgccagcaccttgatcttggacttctggcctccagaactgtgaaagaataaa-a 54964

Query: 139   ttctgttgtttaagcca 155
             |||||||||||||||||
Sbjct: 54965 ttctgttgtttaagcca 54981



>AC034241 ACCESSION:AC034241 NID: gi 17975241 gb AC034241.4 Homo 

sapiens chromosome 5 clone CTD-2360020, complete sequence
Length = 77702 

Score = 149 bits (75), Expect = 2e-33 
Identities = 124/137 (90%), Gaps = 2/137 (1%) 
Strand = Plus / Plus 

Query: 20    tgtgaggacacagcnagaagcaagtctntgcatgncnagaagaacggcctcaacagacac 79
             |||||||||||||| |||||||||| | |||| | | ||||||| |||||||||||||||
Sbjct: 10337 tgtgaggacacagcgagaagcaagtatctgcaagtcaagaagaaaggcctcaacagacac 10396

Query: 80    canncctgccagcaccttgatcttgg-cttntggcctccagaactgtgaaagantaaaga 138
             ||  |||||||||||||||||||||| ||| |||||||||||||||||||||| |||| |
Sbjct: 10397 cagccctgccagcaccttgatcttggacttctggcctccagaactgtgaaagaataaa-a 10455

Query: 139   ttctgttgtttaagcca 155
             |||||||||||||||||
Sbjct: 10456 ttctgttgtttaagcca 10472
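
The "Identities" lines in the BLAST-style hits carry the identity count, the alignment length and, when present, the gap count. The following Python sketch pulls those numbers out with a regular expression so they can be compared against the alignment rows; the pattern and the function name `parse_identities` are illustrative assumptions and only cover the line layout seen in this listing.

```python
import re

# Matches e.g. "Identities = 124/137 (90%), Gaps = 2/137 (1%)";
# the Gaps part is optional because ungapped hits omit it.
IDENTITIES_RE = re.compile(
    r"Identities\s*=\s*(\d+)/(\d+)\s*\((\d+)%\)"
    r"(?:,\s*Gaps\s*=\s*(\d+)/(\d+))?"
)


def parse_identities(line):
    """Return identity statistics from one summary line, or None."""
    m = IDENTITIES_RE.search(line)
    if m is None:
        return None
    identities, length = int(m.group(1)), int(m.group(2))
    gaps = int(m.group(4)) if m.group(4) else 0
    return {
        "identities": identities,
        "length": length,
        "gaps": gaps,
        "percent": 100.0 * identities / length,
    }


stats = parse_identities("Identities = 124/137 (90%), Gaps = 2/137 (1%)")
print(stats)
```

For the two chromosome 5 hits above, the parsed length (137) equals the 136 aligned query bases (20 to 155) plus the single gap column in the query row.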



SEQ ID No 10 

T32454 

ID T32454 standard; DNA; 30967 BP. 

AC T32454; 

DT 10-DEC-1996 (first entry)

DE Calpain large subunit 1 gene. 

KW Calpain; subunit; calcium; protease; mutation; treatment; 

KW detection; identification; diagnosis; limb girdle muscular dystrophy;

KW LGMD2; calcium activated neutral protease; CANP; ss. 

OS Homo sapiens. 

FH Key Location/Qualifiers 

FT exon 1367. .1674 

FT /*tag= a 

FT /label= Exon 1. 

FT intron 1675. .3689 

FT /*tag= b 

FT /label= Intron 1. 

FT exon 3690. .3759 

FT /*tag= c 

FT /label= Exon 2. 

FT intron 3760. .5390 

FT /*tag= d 

FT /label= Intron 2. 

FT exon 5391. .5509 

FT /*tag= e 

FT /label= Exon 3. 

FT intron 5510. .7015 

FT /*tag= f 

FT /label= Intron 3. 

FT exon 7016. .7050 

FT /*tag= g 

FT /label= Exon 4. 

FT intron 7051. .8128 

FT /*tag= h 

FT /label= Intron 4. 

FT exon 8129. .8297 

FT /*tag= i 

FT /label= Exon 5. 

FT intron 8298. .8889 

FT /*tag= j 

FT /label= Intron 5. 

FT exon 8890. .9297 

FT /*tag= k 

FT /label= Exon 6. 

FT intron 9298..11843

FT /*tag= l

FT /label= Intron 6.

FT exon 11844. .11927 

FT /*tag= m 

FT /label= Exon 7. 

FT intron 11928. .13458 

FT /*tag= n

FT /label= Intron 7.

FT exon 13459. .13545 

FT /*tag= o 

FT /label= Exon 8. 

FT intron 13456. .15026 

FT /*tag= p 

FT /label= Intron 8. 



FT exon 15027. .16104 

FT /*tag= q 

FT /label= Exon 9. 

FT intron 16105. .17284 

FT /*tag= r 

FT /label= Intron 9. 

FT exon 17285. .17445 

FT /*tag= s 

FT /label= Exon 10. 

FT intron 17446. .19448 

FT /*tag= t 

FT /label= Intron 10 

FT exon 19449. .19618 

FT /*tag= u 

FT /label= Exon 11. 

FT intron 19619. .19929 

FT /*tag= v 

FT /label= Intron 11 

FT exon 19930. .19941 

FT /*tag= w 

FT /label= Exon 12. 

FT intron 19942. .20604 

FT /*tag= x 

FT /label= Intron 12 

FT exon 20605. .20813 

FT /*tag= y 

FT /label= Exon 13. 

FT intron 20814. .21551 

FT /*tag= z 

FT /label= Intron 13 

FT exon 21552. .21558 

FT /*tag= aa 

FT /label= Exon 14.

FT intron 21589. .23746 

FT /*tag= ab 

FT /label= Intron 14

FT exon 23747. .23764 

FT /*tag= ac 

FT /label= Exon 15. 

FT intron 23765. .26027 

FT /*tag= ad 

FT /label= Intron 15 

FT exon 26028. .26141 

FT /*tag= ae 

FT /label= Exon 16. 

FT intron 26142. .27119 

FT /*tag= af 

FT /label= Intron 16 

FT exon 27120. .27197 

FT /*tag= ag 

FT /label= Exon 17. 

FT intron 27198. .27602 

FT /*tag= ah 

FT /label= Intron 17 

FT exon 27603. .27660 

FT /*tag= ai 

FT /label= Exon 18. 

FT intron 27661. .27747 

FT /*tag= aj 

FT /label= Intron 18 
FT intron [location illegible]
FT /*tag= ap
FT /label= Intron 21
FT exon 28699..28815
FT /*tag= aq
FT /label= Exon 22.
FT intron 28816..29101
FT /*tag= ar
FT /label= Intron 22
FT exon 29102..29260
FT /*tag= as
FT /label= Exon 23.
FT intron 29261..29562
FT /*tag= at
FT /label= Intron 23
FT exon 29563..29589
FT /*tag= au
FT /label= Exon 24.
PN WO9616175-A2.
PD 30-MAY-1996.
PF 21-NOV-1995; E04575.
PR 22-NOV-1994; EP-402668.
PA (ASFR-) ASSOC FR CONTRE myopathie;
PI Beckmann J, Richard I;
DR WPI; 96-268611/27.
DR P-PSDB; R99579.


PT Human novel Calpain large subunit 1 gene encoding a calcium

PT dependent protease - used to develop prods, for the diagnosis and 

PT treatment of limb-girdle muscular dystrophy 2 disease 

PS Claim 1; Figure 8; 66pp; English. 

CC The calpain large subunit 1 gene located on chromosome 15 codes for 

CC a calcium activated neutral protease (CANP3) belonging to the 

CC calpain family. Mutations in the gene induce limb-girdle muscular 

CC dystrophy (LGMD) 2 disease. The gene, and fragments of it, can be 

CC used in the prevention, treatment, diagnosis and detection of a 

CC predisposition to LGMD2 disease. A cDNA version of the gene is 

CC described in T32455.

SQ Sequence 30967 BP; 7629 A; 7648 C; 7675 G; 8015 T; 



Query Match 10.9%; Score 32; DB 1; Length 30967; 

Best Local Similarity 62.5%; Pred. No. 1.2; 

Matches 50; Conservative 0; Mismatches 30; Indels 0; Gaps 0; 



Qy 182 cttgcgcactgtgagtccctggacgagttactccacctctctgaacctcctcctcacttg 241



Db 3185 CTTACTAGCTGTGTGTCTTTGCACGAGTTTCTTAACCTCTCTGGGCCTCAGTTTCCTTAT 3244 

Qy 242 cataatgggaaaaataatgg 261 

I II I I I I I I I I 
Db 3245 CTGAAAAATAACAATGATAG 3264 



RESULT 7 
N90388 

ID N90388 standard; cDNA; 5719 BP.

AC N90388; 

DT 20-OCT-1989 (first entry) 

DE cDNA encoding human platelet-derived growth factor receptor. 

KW cDNA; human platelet derived growth factor receptor; agonist; 

KW antagonist; drugs; wound healing; atherosclerosis;

KW cancer; genetic disorders; antibodies. 

OS Homo sapiens (human) 

FH Key Location/Qualifiers 

FT cds 462 

FT /*tag= a 

FT signal_peptide 462. .557 

FT /*tag= b 

FT 500. .524 

PN EP-327369-A. 

PD 09-AUG-1989. 

PF 02-FEB-1989; 301021. 

PR 02-FEB-1988; US-151414. 

PA (REGC) Univ of California. 

PI Williams L T, Escobedo J E; 

DR WPI; 89-229378/32. 

DR P-PSDB; P90646. 

PT New DNA encoding human platelet derived growth factor receptor 

PT - useful eg for assessing agonist and antagonist drugs. 

PS Claim 1; page 3; 12pp; English. 

CC cDNA encoding human platelet derived growth factor receptor (see P90646 

CC for features) . Used to make probes and antibodies, and to evaluate drugs 

SQ Sequence 5719 BP; 1266 A; 1714 C; 1548 G; 1191 T; 

Query Match 10.0%; Score 29.4; DB 1; Length 5719; 

Best Local Similarity 55.3%; Pred. No. 4.1;

Matches 57; Conservative 0; Mismatches 46; Indels 0; Gaps 

Qy 153 tgggtccttccaggacactgacgtctcagcttgcgcactgtgagtccctggacgagttac 212 

I I I I I I I I I I I I II I II I I II I II I I I I I I 

Db 4633 TGTGCCAGTATATGGCCCTGGCTCTGCATTGGACCTGCTATGAGGCTTTGGAGGAATCCC 4692

Qy 213 tccacctctctgaacctcctcctcacttgcataatgggaaaaa 255 

II I I I I I I I I I I II I I I I I I I I I I I I I 

Db 4693 TCACCCTCTCTGGGCCTCAGTTTCCCCTTCAAAAAATGAATAA 4735



Query = SEQ ID NO: 10 

(294 letters) 



Score E 

Sequences producing significant alignments: (bits) Value 

AC140062 ACCESSION:AC140062 NID: gi 29150317 gb AC140062.11 Ho... 351 5e-94

>AC140062 ACCESSION:AC140062 NID: gi 29150317 gb AC140062.11 Homo

sapiens 12 BAC RP13-298C8 (Roswell Park Cancer Institute 
Human BAC Library) complete sequence 
Length = 64695 

Score = 351 bits (177), Expect = 5e-94 
Identities = 180/181 (99%) 
Strand = Plus / Plus 

Query: 114  aggcactgggtaggaacacagccaagaacgattgcaggatgggtccttccaggacactga 173
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 1688 aggcactgggtaggaacacagccaagaacgattgcaggatgggtccttccaggacactga 1747

Query: 174  cgtctcagcttgcgcactgtgagtccctggacgagttactccacctctctgaacctcctc 233
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 1748 cgtctcagcttgcgcactgtgagtccctggacgagttactccacctctctgaacctcctc 1807

Query: 234  ctcacttgcataatgggaaaaataatggacatagggagatgaaacaagaccttggagacc 293
            ||||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||
Sbjct: 1808 ctcacttgcataatgggaaaaataatggacataggaagatgaaacaagaccttggagacc 1867

Query: 294  a 294
            |
Sbjct: 1868 a 1868



Query = SEQ ID NO: 11 

(241 letters) 



Score E 

Sequences producing significant alignments: (bits) Value 

AC112518.1.1.78409 426 e-117

>AC112518.1.1.78409

Length = 78409 

Score = 426 bits (215) , Expect = e-117 
Identities = 232/239 (97%) 
Strand = Plus / Minus 

Query: 3    atgccttctaaacagcctaccctgcccagngccatgattactgtgaccacatcttcagag 62
            |||||||||||||||||||||||||| || |||||||||||||||||||||||||||||
Sbjct: 2737 atgccttctaaacagcctaccctgccaagtgccatgattactgtgaccacatcttcagaa 2678

Query: 63   ccagaaaacaggatacctggccctaagcatgcactcatggagcanaagagttttaaatct 122
            |||||||||||||||||||||||||||||||||||||||||||| |||||||||||||||
Sbjct: 2677 ccagaaaacaggatacctggccctaagcatgcactcatggagcagaagagttttaaatct 2618

Query: 123  gntatgccacagaagacagaagataacatgcttactacacttgtnaagcaacatgcagcc 182
            |  ||||||||||||||||||||||||||||||||||||||||| |||||||||||||||
Sbjct: 2617 ggaatgccacagaagacagaagataacatgcttactacacttgtaaagcaacatgcagcc 2558

Query: 183  agccatttccagtgcaaattatctcattgcatagtgtgacaactaaaggtcataaccat 241
            |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 2557 agccatttccagtgcaaattatctcattgcatagtgtgacaactaaaggtcataaccat 2499



Query= SEQ ID NO: 12

(197 letters) 



Score E 

Sequences producing significant alignments: (bits) Value 

AL158207 ACCESSION:AL158207 NID: gi 12717949 emb AL158207.15 H... 391 e-106

>AL158207 ACCESSION:AL158207 NID: gi 12717949 emb AL158207.15 Human DNA
sequence from clone RP11-409K20 on chromosome 9 Contains
the TOR1B gene for torsin family 1 member B (torsin B)
(DQ1), the DYT1 gene for "dystonia 1, torsion" (autosomal
dominant; torsin A) (DQ2, TOR1A), the gene for
hepatocellular carcinoma-associated antigen 59 (HSPC220,
LOC51759), the USP20 gene for ubiquitin specific protease
20 (KIAA1003), and the gene for formin-binding protein 17
(FBP17, includes KIAA0554, FLJ13619, FLJ10754 and
FLJ10113). Contains ESTs, STSs, GSSs and four CpG islands,
c 

Length = 169963 

Score = 391 bits (197), Expect = e-106 
Identities = 197/197 (100%) 
Strand = Plus / Plus 

Query: 1      acaggatgcctgtaatcattattcagtgagcagcaacctgcagcagctcctcctgactgg 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 137396 acaggatgcctgtaatcattattcagtgagcagcaacctgcagcagctcctcctgactgg 137455

Query: 61     cagatgggcctggcggccacccagaggctggggacacagcaagaatccagcacagcaccg 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 137456 cagatgggcctggcggccacccagaggctggggacacagcaagaatccagcacagcaccg 137515

Query: 121    atcccgattccctcctccccaaactacctgagccatggacctcattttgtggacaaaatt 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 137516 atcccgattccctcctccccaaactacctgagccatggacctcattttgtggacaaaatt 137575

Query: 181    aaacttgccactttcac 197
              |||||||||||||||||
Sbjct: 137576 aaacttgccactttcac 137592



SEQ ID No 13 

V02739 

ID V02739 standard; cDNA to mRNA; 3223 BP.
AC V02739; 

DT 21-JUL-1998 (first entry) 

DE S. lepidophylla trehalose-6-phosphate synthetase/phosphatase gene. 
KW Trehalose-6-phosphate synthetase/phosphatase; Resurrection plant; cold; 
KW microphyll; dehydration; probe; hybridisation; E. coli; yeast; heat; ds; 
KW constitutive expression; transgenic plant; salinity; drought; enzyme; 
KW food; hormone; vaccine; preservative. 
OS Selaginella lepidophylla. 



FH Key Location/Qualifiers

FT CDS 111. .3095 

FT /*tag= a 

FT /product= "trehalose-6-phosphate synthase/phosphatase" 



PN W09742327-A2. 

PD 13-NOV-1997.

PF 06-MAY-1997; MX0012.

PR 08-MAY-1996; MX-001719.

PA (UYME-) UNIV MEXICO NACIONAL AUTONOMA.

PI Iturriaga de la Fuente G, Zentella Gomez R; 

DR WPI; 98-008448/01. 

DR P-PSDB; W44844. 

PT Selaginella lepidophylla trehalose-6-phosphate 

PT synthetase/phosphatase - useful for conferring thermo- and 

PT osmo-tolerance on plants and for producing trehalose as a 

PT preservative 

PS Claim 1; Page 30-34; 53pp; Spanish. 

CC This nucleotide sequence encodes the bifunctional trehalose-6-phosphate 

CC synthetase/phosphatase (TPS/P) enzyme from Selaginella lepidophylla (the 

CC 'Resurrection' plant). The sequence was isolated from a cDNA library

CC constructed from mRNA purified from S. lepidophylla microphylls which 

CC had been dehydrated for 2.5 hr. The probes for hybridisation of the 

CC library were derived from consensus sequences between E. coli and yeast 

CC TPS/P. The gene is constitutively expressed in S. lepidophylla, so 

CC when used to generate transgenic plants does not require additional 

CC regulatory sequences. Increased production of trehalose in plants 

CC increases tolerance to environmental stresses, especially to extremes 

CC of heat or cold, to salinity and drought. The transgenic plants can grow 

CC in otherwise unfavourable conditions or can be grown using less water 

CC than usual. Transformed microorganisms are useful for producing large 

CC quantities of trehalose for use as preservatives for food and for 

CC enzymes, hormones, vaccines and other pharmaceutical proteins. 

SQ Sequence 3223 BP; 789 A; 779 C; 959 G; 696 T; 



Query Match 8.5%; Score 32.8; DB 1; Length 3223; 

Best Local Similarity 61.9%; Pred. No. 0.33; 

Matches 52; Conservative 0; Mismatches 32; Indels 0; Gaps 0; 

Qy 166 tctggtcaccaatttcacagcctggacagagcaagaaggtgcggctggcttaggaggcgg 225 

I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 2695 TGTCGTCACCAAAGTCACCCGGCCGTGGAAGCGAGCAGCAGCAGCAAGCTGAGGAGGCTVA 2754 



Qy 226 cctgccgggggggatcgtctgtcc 249 

II III I II II I II II I 
Db 2755 GCAGATGGGAAGGATCGTCCGTGC 2778



RESULT 1 
V89805/C 

ID V89805 standard; cDNA; 265 BP. 

AC V89805; 

DT 15-FEB-1999 (first entry) 

DE EST clone CJ498. 

KW Human; secreted protein; expressed sequence tag; EST; haematopoiesis ; 

KW tissue growth; activin; inhibin; chemotaxis; chemokinesis ; haemostatic; 

KW receptor; ligand; thrombolytic; anti-inflammatory; cadherin; anti-tumour; 

KW gene therapy; ss. 

OS Homo sapiens. 

PN WO9845436-A2.

PD 15-OCT-1998. 

PF 10-APR-1998; U06955.

PR 10-APR-1997; US-838821.

PA (GEMY) GENETICS INST INC.

PI Agostino MJ, Jacobs K, Lavallie ER, McCoy JM, Merberg D, 

PI Racie LA, Spaulding V, Treacy M; 

DR WPI; 99-070077/06. 

PT New polynucleotides encoding human secreted proteins - derived from 

PT e.g. human blood, kidney, foetal lung, placenta, testes, brain, 

PT ovary, pituitary, retina and colon cDNA libraries. 

PS Claim 1; Page 336; 618pp; English. 

CC The present sequence represents a human expressed sequence tag (EST) . 

CC The polynucleotide, which is a secreted EST, and the encoded protein 

CC are predicted to have useful biological activities which would make 

CC them suitable for treating, preventing or ameliorating medical 

CC conditions in humans and animals, although no supporting data is 

CC given. Suggested activities include nutritional activity, immune 

CC stimulating or suppressing activity, haematopoiesis regulating 

CC activity, tissue growth activity, activin/inhibin activity, 

CC chemotactic/chemokinetic activity, haemostatic and thrombolytic 

CC activity, receptor/ligand activity, anti-inflammatory activity, 

CC cadherin/tumour invasion suppressor activity, tumour inhibition 

CC activity. The polynucleotide may also be useful for gene therapy. 

SQ Sequence 265 BP; 62 A; 59 C; 59 G; 85 T; 



Query Match 10.3%; Score 33.6; DB 1; Length 265; 

Best Local Similarity 61.4%; Pred. No. 0.1; 

Matches 51; Conservative 0; Mismatches 32; Indels 0; Gaps 0; 

Qy 14 ctcagcagacnaaccacagcttcctgccctttgcagatggcntgaanataagagtttgcc 73 

I I I I I I I II III I I I I I I I II I I I I I II I II I II I 

Db 110 CACTGCAGAAAAATTGAAGTTGAATCCCCTCTGCACACAGACAGGGAATAAAAGTTTGAC 51 

Qy 74 aaacaactaagatgggctcttga 96 

I I I I I I I I I I I I I II I 

Db 50 AAACACCCAAGAAAAACTCTTTA 28 



RESULT 1 
B17653/C 

LOCUS       B17653       548 bp    DNA             GSS       04-JUN-1998
DEFINITION  347F4.TPB CIT978SKA1 Homo sapiens genomic clone A-347F04,
            genomic survey sequence.
ACCESSION   B17653
VERSION     B17653.1  GI:2125402
KEYWORDS    GSS.
SOURCE      human.
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 548)
  AUTHORS   Adams,M.D., Kelley,J.M., Rounsley,S.R. and Venter,J.C.
  TITLE     Use of a BAC End Sequence Database for Sequence-Ready Map Building
  JOURNAL   Unpublished (1997)
COMMENT     On Dec 15, 1999 this sequence version replaced gi:4575579.
            Other_GSSs: 347F04.TVB
            Contact: Mark Adams
            Department of Eukaryotic Genomics
            The Institute for Genomic Research
            9712 Medical Center Dr., Rockville, MD 20850, USA
            Tel: 301 838 0200
            Fax: 301 838 0208
            Email: mdadams@tigr.org
            Clones are available from Research Genetics (info@resgen.com). BAC
            end search page:
            http://www.tigr.org/tdb/humgen/bac_end_search/bac_end_search.html
            Seq primer: SP6
            Class: BAC ends.
FEATURES             Location/Qualifiers
     source          1..548
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /clone="A-347F04"
                     /clone_lib="CIT978SKA1"
                     /sex="Female"
                     /cell_type="Fibroblast"
                     /note="Vector: pBAC108L; Site_1: HindIII; Site_2: HindIII;
                     CalTech Human BAC Library A1"
BASE COUNT      153 a    109 c    102 g    184 t
ORIGIN



Query Match 85.8%; Score 279.6; DB 120; Length 548; 

Best Local Similarity 90.5%; Pred. No. 2.2e-73; 

Matches 296; Conservative 0; Mismatches 30; Indels 1; Gaps 1; 

Qy   1 ggacagtggctaactcagcagacnaaccacagcttcctgccctttgcagatggcntgaan 60
       ||||||||||||||||||||||| ||||| |||||| ||||||||||||||||| ||||
Db 512 GGACAGTGGCTAACTCAGCAGACGAACCAGAGCTTCATGCCCTTTGCAGATGGCATGAAG 453

Qy  61 ataagagtttgccaaacaactaagatgggctcttgattgagcaaanaaaccacaacatgg 120
       ||||||||||||||||||||||||||||||||||||||||||||| ||||||||||||||
Db 452 ATAAGAGTTTGCCAAACAACTAAGATGGGCTCTTGATTGAGCAAAGAAACCACAACATGG 393

Qy 121 gacacacagagccaccctattgncctactgtcattcaagcttaaaggagacatatctaca 180
       ||||||||||||||||  |||| | |||||||||||||||||||||||||||||||||||
Db 392 GACACACAGAGCCACCTAATTGCCATACTGTCATTCAAGCTTAAAGGAGACATATCTACA 333

Qy 181 gacagggtttgagcctagtnatggnganaactttcttggatgtctcaacancctgganat 240
       |||||||||||||| |||| |||| || |||||||||||||||||||||| |||||| ||
Db 332 GACAGGGTTTGAGCATAGTAATGGTGAGAACTTTCTTGGATGTCTCAACAGCCTGGAGAT 273

Qy 241 gannntcccnacaaggcagaanancnaggtggnacattgntnntattgctttttatt-ca 299
       ||   |||| | ||||||||| |   |||||| |||||| |  ||||| |||||||| ||
Db 272 GAAATTCCCAAGAAGGCAGAAAATAGAGGTGGCACATTGGTTTTATTGTTTTTTATTACA 213

Qy 300 attataaaagtaatgcatgctttttgt 326
       |||||||||||||||||||||||||||
Db 212 ATTATAAAAGTAATGCATGCTTTTTGT 186



Query= SEQ ID NO: 13 

(387 letters) 



Score E 

Sequences producing significant alignments: (bits) Value 

AL161936.15.1.155584 753 0.0

>AL161936.15.1.155584

Length = 155584 

Score = 753 bits (380), Expect = 0.0
Identities = 385/387 (99%) 
Strand = Plus / Plus 

Query: 1      tggtgcttactaaaaattgaataancgtggaaaagagaaaatctccctctttaaaaggaa 60
              |||||||||||||||||||||||| |||||||||||||||||||||||||||||||||||
Sbjct: 144402 tggtgcttactaaaaattgaataaacgtggaaaagagaaaatctccctctttaaaaggaa 144461

Query: 61     cactgttgtggacattttaaaatgcaaacgccttggctggaagtcagaaatcgtgttctc 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 144462 cactgttgtggacattttaaaatgcaaacgccttggctggaagtcagaaatcgtgttctc 144521

Query: 121    tctgctaaacctggtgtagcatttaacacgcttgaagtggaggcatctggtcaccaattt 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 144522 tctgctaaacctggtgtagcatttaacacgcttgaagtggaggcatctggtcaccaattt 144581

Query: 181    cacagcctggacagagcaagaaggtgcggctggcttaggaggcggcctgccgggggggat 240
              ||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||||
Sbjct: 144582 cacagcctggacagagcaagaaggtgcggctggtttaggaggcggcctgccgggggggat 144641

Query: 241    cgtctgtccatctgggcttggtaaatgtcaagggtcatttccctgtcctgacatttgatt 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 144642 cgtctgtccatctgggcttggtaaatgtcaagggtcatttccctgtcctgacatttgatt 144701

Query: 301    gtgaagcaggttgcgaggtaactctttcaagggactggactgtgacagtcaccatagttg 360
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 144702 gtgaagcaggttgcgaggtaactctttcaagggactggactgtgacagtcaccatagttg 144761

Query: 361    gacaataaaacccgaacatccttcacc 387
              |||||||||||||||||||||||||||
Sbjct: 144762 gacaataaaacccgaacatccttcacc 144788



Query= SEQ ID NO: 14 

(326 letters) 



Score E 

Sequences producing significant alignments: (bits) Value 

AC092768 ACCESSION:AC092768 NID: gi 18182777 gb AC092768.6 Hom... 466 e-128

>AC092768 ACCESSION:AC092768 NID: gi 18182777 gb AC092768.6 Homo
sapiens chromosome 11, clone RP11-1149L18, complete
sequence
Length = 146364 

Score = 466 bits (235), Expect = e-128
Identities = 301/327 (92%), Gaps = 1/327 (0%) 
Strand = Plus / Minus 

Query: 1    ggacagtggctaactcagcagacnaaccacagcttcctgccctttgcagatggcntgaan 60
            ||||||||||||||||||||||| ||||| |||||||||||||||||||||||| ||||
Sbjct: 8644 ggacagtggctaactcagcagacgaaccagagcttcctgccctttgcagatggcatgaag 8585

Query: 61   ataagagtttgccaaacaactaagatgggctcttgattgagcaaanaaaccacaacatgg 120
            ||||||||||||||||||||||||||||||||||||||||||||| ||||||||||||||
Sbjct: 8584 ataagagtttgccaaacaactaagatgggctcttgattgagcaaagaaaccacaacatgg 8525

Query: 121  gacacacagagccaccctattgncctactgtcattcaagcttaaaggagacatatctaca 180
            |||||||||||||||||||||| |||||||||||||||||||||||||||||||||||||
Sbjct: 8524 gacacacagagccaccctattgccctactgtcattcaagcttaaaggagacatatctaca 8465

Query: 181  gacagggtttgagcctagtnatggnganaactttcttggatgtctcaacancctgganat 240
            ||||||||||||||||||| |||| || |||||||||||||||||||||| |||||| ||
Sbjct: 8464 gacagggtttgagcctagtaatggtgagaactttcttggatgtctcaacagcctggagat 8405

Query: 241  gannntcccnacaaggcagaanancnaggtggnacattgntnntattgctttttatt-ca 299
            ||   |||| | ||||||||| |   |||||| |||||| |  ||||| |||||||| ||
Sbjct: 8404 gaaattcccaagaaggcagaaaatagaggtggcacattggttttattgttttttattaca 8345

Query: 300  attataaaagtaatgcatgctttttgt 326
            |||||||||||||||||||||||||||
Sbjct: 8344 attataaaagtaatgcatgctttttgt 8318



SEQ ID No 15 

Q91200/C 

ID Q91200 standard; cDNA; 3320 BP. 

AC Q91200; 

DT 11-DEC-1995 (first entry)

DE H-NUC retinoblastoma protein binding protein. 

KW H-NUC; tumour suppressor; retinoblastoma binding protein; 

KW therapeutic; gene therapy; ss.

OS Homo sapiens. 

FH Key Location/Qualifiers 

FT CDS 101..2576

FT /*tag= a 

PN WO9517198-A1.

PD 29-JUN-1995. 

PF 20-DEC-1994; U14813. 

PR 20-DEC-1993; US-170586. 

PA (TEXA ) UNIV TEXAS SYSTEM. 

PI Chen P, Lee W; 

DR WPI; 95-240467/31. 

DR P-PSDB; R75848. 

PT DNA encoding a retinoblastoma protein binding protein - used in the

PT gene therapy of cancers, esp. breast cancer. 

PS Claim 4; Fig 3A-3I; 85pp; English. 

CC The H-NUC DNA and protein encoded by it may be used to suppress the 

CC neoplastic phenotype of a cancer cell which lacks endogenous H-NUC 

CC protein. The DNA and protein inhibit cancer, especially mamma 

CC carcinoma, cell division and proliferation. A retro virus vector or 

CC adeno virus vector (AC-H-NUC) may be used for the ex vivo gene 

CC therapy of cancer, where the H-NUC gene is transferred to abnormally 

CC proliferating cells; gene expression in sufficient amounts 

CC suppresses proliferation of those cells. The cells are then 

CC returned to the original mammal. 

SQ Sequence 3320 BP; 1049 A; 674 C; 662 G; 935 T;



Query Match 16.4%; Score 27.2; DB 1; Length 3320; 

Best Local Similarity 72.9%; Pred. No. 6.4; 

Matches 35; Conservative 0; Mismatches 13; Indels 0; Gaps 

Qy  106 tcttcaagtgcttgttaaggccatttgtctatttcactctcaagtaaa 153
        ||||||||| ||||| |||   | || | | ||||| |  ||| ||||
Db 2283 TCTTCAAGTTCTTGTAAAGCAGACTTATATTTTTCATTTGCAAATAAA 2236



RESULT 7 
Q59800 

ID Q59800 standard; cDNA; 372 BP. 

AC Q59800; 

DT 16-MAR-1994 (first entry) 

DE Human brain Expressed Sequence Tag EST00733. 

KW Gene transcription product; genetic markers; tagging; in vivo; 

KW transcription; mapping; locations; chromosomes; chromosomal; ss. 

OS Homo sapiens. 

PN WO9316178-A.

PD 19-AUG-1993. 

PF 12-FEB-1993; U01294. 

PR 12-FEB-1992; US-837195. 

PA (USSH ) US DEPT HEALTH & HUMAN SERVICE. 

PI Adams MD, Moreno RF, Venter CJ; 



DR WPI; 93-272882/34. 

PT Enriched oligonucleotides and corresp. sequences - used as 

PT markers for human genes transcribed in-vivo, facilitate tagging

PT of most human genes 

PS Example 4; Page 233; 500pp; English. 

CC The Expressed Sequence Tag was isolated from a human brain cDNA 

CC library as part of a large set of ESTs which can be used as markers 

CC for human genes transcribed in vivo. They can be used to facilitate 

CC tagging of most human genes, for mapping locations of expressed genes 

CC on chromosomes, for individual or forensic identification, for mapping 

CC locations of disease-associated genes, for identification of tissue 

CC type, and for prepn. of antisense sequences, probes and constructs. 

CC EST00733 has a "poor" coding probability as evaluated using the 

CC coding-region prediction program CRM. See also Q59041-Q61440.

SQ Sequence 372 BP; 87 A; 93 C; 55 G; 134 T; 



Query Match 16.3%; Score 27; DB 1; Length 372; 

Best Local Similarity 66.1%; Pred. No. 2.9; 

Matches 39; Conservative 0; Mismatches 20; Indels 0; Gaps 



Qy 108 ttcaagtgcttgttaaggccatttgtctatttcactctcaagtaaataaaaatattttt 166
       ||| | |||||||| |   ||| | ||||| | | |||  | |  ||||  |  |||||
Db 151 TTCTACTGCTTGTTCAATACATCTCTCTATGTAAATCTTGACTCCATAATGAGGTTTTT 209



Query= SEQ ID NO: 15 

(166 letters) 



Score E 

Sequences producing significant alignments: (bits) Value

AC008115.3.1.158431 321 7e-86

>AC008115.3.1.158431

Length = 158431 

Score = 321 bits (162), Expect = 7e-86
Identities = 165/166 (99%) 
Strand = Plus / Minus 

Query: 1     tcagtatcctgacctggcaaggtgttccttaacctcccctctggatcccccttagcacac 60
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 43020 tcagtatcctgacctggcaaggtgttccttaacctcccctctggatcccccttagcacac 42961

Query: 61    atctgggacaatggagcgttcagcaccacggacagcattacaccctcttcaagtgcttgt 120
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 42960 atctgggacaatggagcgttcagcaccacggacagcattacaccctcttcaagtgcttgt 42901

Query: 121   taaggccatttgtctatttcactctcaagtaaataaaaatattttt 166
             ||| ||||||||||||||||||||||||||||||||||||||||||
Sbjct: 42900 taaagccatttgtctatttcactctcaagtaaataaaaatattttt 42855



SEQ ID NO 16 

HSAC002043/C 
LOCUS       HSAC002043 226841 bp DNA HTG 30-APR-1997
DEFINITION  Homo sapiens clone 381E11, SEQUENCING IN PROGRESS 4
            unordered pieces.
ACCESSION   AC002043
VERSION     AC002043.1 GI:2062147
KEYWORDS    HTG; HTGS_PHASE1.
SOURCE      human.
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Mammalia;
            Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1 (bases 1 to 226841)
  AUTHORS   Adams,M.D., Loftus,B.J., Zhou,L., Phillips,C., Brandon,R.C.,
            Fuhrmann,J., Kim,U.J., Kerlavage,A.R. and Venter,J.C.
  TITLE     Human chromosome 16p13 BAC clone CIT987SK-381E11
  JOURNAL   Unpublished
REFERENCE   2 (bases 1 to 226841)
  AUTHORS   Adams,M.D. and Loftus,B.J.
  TITLE     Direct Submission
  JOURNAL   Submitted (29-APR-1997) The Institute for Genomic Research, 9712
            Medical Center Dr., Rockville, MD 20850, USA
REFERENCE   3 (bases 1 to 226841)
  AUTHORS   Adams,M.D.
  TITLE     Direct Submission
  JOURNAL   Submitted (30-APR-1997) The Institute for Genomic Research, 9712
            Medical Center Dr, Rockville, MD 20850, USA
COMMENT     * NOTE: This is a 'working draft' sequence. It currently
            * consists of 4 contigs. The true order of the pieces
            * is not known and their order in this sequence record is
            * arbitrary. Gaps between the contigs are represented as
            * runs of N, but the exact sizes of the gaps are unknown.
            * This record will be updated with the finished sequence
            * as soon as it is available and the accession number will
            * be preserved.
            *        1     2951: contig of 2951 bp in length
            *                    gap of unknown length
            *     2952    14471: contig of 11520 bp in length
            *                    gap of unknown length
            *    14472   135316: contig of 120845 bp in length
            *                    gap of unknown length
            *   135317   226841: contig of 91525 bp in length.
FEATURES             Location/Qualifiers
     source          1..226841
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /clone="381E11"
BASE COUNT  61420 a 49870 c 51181 g 64328 t 42 others
ORIGIN



Query Match 9.0%; Score 57.2; DB 54; Length 226841;

Best Local Similarity 87.3%; Pred. No. 1e-07;

Matches 62; Conservative 0; Mismatches 9; Indels 0; Gaps 0;



Qy    568 agtagagacaagttttcgccatgttggtcaagctggtctcaaacttctaacctnacgtaa 627
          |||||||||||| ||||||||||||||| |||||||||||||||  || |||| | ||||
Db 135306 AGTAGAGACAAGGTTTCGCCATGTTGGTGAAGCTGGTCTCAAACCCCTGACCTCAGGTAA 135247

Qy    628 tccaccccgct 638
          |||||||  ||
Db 135246 TCCACCCGCCT 135236



RESULT 3 
Q60651 

ID Q60651 standard; cDNA; 344 BP. 

AC Q60651; 

DT 16-MAR-1994 (first entry) 

DE Human brain Expressed Sequence Tag EST02665. 

KW Gene transcription product; genetic markers; tagging; in vivo; 

KW transcription; mapping; locations; chromosomes; chromosomal; ss. 

OS Homo sapiens. 

PN WO9316178-A.

PD 19-AUG-1993. 

PF 12-FEB-1993; U01294. 

PR 12-FEB-1992; US-837195. 

PA (USSH ) US DEPT HEALTH & HUMAN SERVICE. 

PI Adams MD, Moreno RF, Venter CJ; 

DR WPI; 93-272882/34. 

PT Enriched oligonucleotides and corresp. sequences - used as 

PT markers for human genes transcribed in-vivo, facilitate tagging 

PT of most human genes

PS Example 4; Page 368; 500pp; English. 

CC The Expressed Sequence Tag was isolated from a human brain cDNA 

CC library as part of a large set of ESTs which can be used as markers 

CC for human genes transcribed in vivo. They can be used to facilitate 

CC tagging of most human genes, for mapping locations of expressed genes 

CC on chromosomes, for individual or forensic identification, for mapping 

CC locations of disease-associated genes, for identification of tissue 

CC type, and for prepn. of antisense sequences, probes and constructs. 

CC EST02665 has a "poor" coding probability as evaluated using the 

CC coding-region prediction program CRM. See also Q59041-Q61440.

SQ Sequence 344 BP; 78 A; 87 C; 79 G; 95 T; 



Query Match 8.0%; Score 50.8; DB 1; Length 344;

Best Local Similarity 81.7%; Pred. No. 6.5e-07;

Matches 58; Conservative 0; Mismatches 13; Indels 0; Gaps 0;



Qy 568 agtagagacaagttttcgccatgttggtcaagctggtctcaaacttctaacctnacgtaa 627
       |||||||||| | |||||||||||||| || ||||||||  |||| || |||| | || |
Db  98 AGTAGAGACAGGGTTTCGCCATGTTGGCCAGGCTGGTCTTGAACTCCTGACCTCAGGTGA 157

Qy 628 tccaccccgct 638
       |||||||  ||
Db 158 TCCACCCACCT 168



RESULT 1 

H69406 

LOCUS       H69406 274 bp mRNA EST 24-OCT-1995
DEFINITION  yr87f02.r1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone
            IMAGE:212283 5' similar to contains Alu repetitive element;, mRNA
            sequence.
ACCESSION   H69406
VERSION     H69406.1 GI:1039612
KEYWORDS    EST.
SOURCE      human.
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1 (bases 1 to 274)
  AUTHORS   Hillier,L., Lennon,G., Becker,M., Bonaldo,M.F., Chiapelli,B.,
            Chissoe,S., Dietrich,N., DuBuque,T., Favello,A., Gish,W.,
            Hawkins,M., Hultman,M., Kucaba,T., Lacy,M., Le,M., Le,N.,
            Mardis,E., Moore,B., Morris,M., Parsons,J., Prange,C., Rifkin,L.,
            Rohlfing,T., Schellenberg,K., Soares,M.B., Tan,F., Thierry-Meg,J.,
            Trevaskis,E., Underwood,K., Wohldmann,P., Waterston,R., Wilson,R.
            and Marra,M.
  TITLE     Generation and analysis of 280,000 human expressed sequence tags
  JOURNAL   Genome Res. 6 (9), 807-828 (1996)
  MEDLINE   97044478
COMMENT     On May 7, 1998 this sequence version replaced gi: 3119472.
            Contact: Wilson RK
            Washington University School of Medicine
            4444 Forest Park Parkway, Box 8501, St. Louis, MO 63108
            Tel: 314 286 1800
            Fax: 314 286 1810
            Email: est@watson.wustl.edu
            Insert Size: 1750
            High quality sequence stops: 214
            Source: IMAGE Consortium, LLNL
            This clone is available royalty-free through LLNL; contact the
            IMAGE Consortium (info@image.llnl.gov) for further information.
            Insert Length: 1750 Std Error: 0.00
            Seq primer: M13RP1
            High quality sequence stop: 214.
FEATURES             Location/Qualifiers
     source          1..274
                     /organism="Homo sapiens"
                     /db_xref="GDB:3785124"
                     /db_xref="taxon:9606"
                     /clone="IMAGE:212283"
                     /clone_lib="Soares fetal liver spleen 1NFLS"
                     /sex="male"
                     /dev_stage="20 week-post conception fetus"
                     /lab_host="DH10B (ampicillin resistant)"
                     /note="Organ: Liver and Spleen; Vector: pT7T3D (Pharmacia)
                     with a modified polylinker; Site_1: Pac I; Site_2: Eco RI;
                     1st strand cDNA was primed with a Pac I - oligo(dT) primer
                     [5' AACTGGAAGAATTAATTAAAGATCTTTTTTTTTTTTTTTTTTT 3'],
                     double-stranded cDNA was ligated to Eco RI adaptors
                     (Pharmacia), digested with Pac I and cloned into the Pac I
                     and Eco RI sites of the modified pT7T3 vector. Library
                     went through one round of normalization. Library
                     constructed by Bento Soares and M.Fatima Bonaldo."
BASE COUNT  53 a 82 c 66 g 73 t
ORIGIN



Query Match 8.9%; Score 56.6; DB 86; Length 274;

Best Local Similarity 86.1%; Pred. No. 1.8e-05;

Matches 62; Conservative 0; Mismatches 10; Indels 0; Gaps 0; 



Qy 567 cagtagagacaagttttcgccatgttggtcaagctggtctcaaacttctaacctnacgta 626
       ||||||||||| | ||||||||||||||||| |||||||||||||| || |||| | ||
Db 167 CAGTAGAGACAGGGTTTCGCCATGTTGGTCAGGCTGGTCTCAAACTCCTGACCTCAGGTG 226

Qy 627 atccaccccgct 638
       ||||||||  ||
Db 227 ATCCACCCGCCT 238



RESULT 14 

HSAC002043/C

LOCUS       HSAC002043 226841 bp DNA HTG 30-APR-1997
DEFINITION  Homo sapiens clone 381E11, SEQUENCING IN PROGRESS 4
            unordered pieces.
ACCESSION   AC002043
VERSION     AC002043.1 GI:2062147
KEYWORDS    HTG; HTGS_PHASE1.
SOURCE      human.
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Mammalia;
            Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1 (bases 1 to 226841)
  AUTHORS   Adams,M.D., Loftus,B.J., Zhou,L., Phillips,C., Brandon,R.C.,
            Fuhrmann,J., Kim,U.J., Kerlavage,A.R. and Venter,J.C.
  TITLE     Human chromosome 16p13 BAC clone CIT987SK-381E11
  JOURNAL   Unpublished
REFERENCE   2 (bases 1 to 226841)
  AUTHORS   Adams,M.D. and Loftus,B.J.
  TITLE     Direct Submission
  JOURNAL   Submitted (29-APR-1997) The Institute for Genomic Research, 9712
            Medical Center Dr., Rockville, MD 20850, USA
REFERENCE   3 (bases 1 to 226841)
  AUTHORS   Adams,M.D.
  TITLE     Direct Submission
  JOURNAL   Submitted (30-APR-1997) The Institute for Genomic Research, 9712
            Medical Center Dr, Rockville, MD 20850, USA
COMMENT     * NOTE: This is a 'working draft' sequence. It currently
            * consists of 4 contigs. The true order of the pieces
            * is not known and their order in this sequence record is
            * arbitrary. Gaps between the contigs are represented as
            * runs of N, but the exact sizes of the gaps are unknown.
            * This record will be updated with the finished sequence
            * as soon as it is available and the accession number will
            * be preserved.
            *        1     2951: contig of 2951 bp in length
            *                    gap of unknown length
            *     2952    14471: contig of 11520 bp in length
            *                    gap of unknown length
            *    14472   135316: contig of 120845 bp in length
            *                    gap of unknown length
            *   135317   226841: contig of 91525 bp in length.
FEATURES             Location/Qualifiers
     source          1..226841
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /clone="381E11"
BASE COUNT  61420 a 49870 c 51181 g 64328 t 42 others
ORIGIN



Query Match 9.0%; Score 57.2; DB 54; Length 226841;

Best Local Similarity 87.3%; Pred. No. 1e-07;

Matches 62; Conservative 0; Mismatches 9; Indels 0; Gaps 0;



Qy    568 agtagagacaagttttcgccatgttggtcaagctggtctcaaacttctaacctnacgtaa 627
          |||||||||||| ||||||||||||||| |||||||||||||||  || |||| | ||||
Db 135306 AGTAGAGACAAGGTTTCGCCATGTTGGTGAAGCTGGTCTCAAACCCCTGACCTCAGGTAA 135247

Qy    628 tccaccccgct 638
          |||||||  ||
Db 135246 TCCACCCGCCT 135236
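
The contig layout given in the COMMENT block of AC002043 can be cross-checked arithmetically. The following is a minimal sketch, not part of the original record: the names `CONTIGS`, `RECORD_LENGTH` and `layout_is_consistent` are illustrative assumptions, and the coordinates are copied from the COMMENT block and LOCUS line above.

```python
# Contig layout as stated in the COMMENT block of AC002043.
# Each tuple is (start, end, stated_length_bp); names are illustrative only.
CONTIGS = [
    (1, 2951, 2951),
    (2952, 14471, 11520),
    (14472, 135316, 120845),
    (135317, 226841, 91525),
]
RECORD_LENGTH = 226841  # length given on the LOCUS line


def layout_is_consistent(contigs, record_length):
    """Check stated lengths, adjacency, and total span of a contig list."""
    expected_start = 1
    for start, end, stated in contigs:
        if start != expected_start:
            return False  # each contig must begin right after the previous one
        if end - start + 1 != stated:
            return False  # inclusive coordinates must match the stated length
        expected_start = end + 1
    # The last contig must end exactly at the record length.
    return expected_start - 1 == record_length


print(layout_is_consistent(CONTIGS, RECORD_LENGTH))
```

The check only confirms that the listed coordinates are internally consistent; it says nothing about the true order of the pieces, which the record states is unknown.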



RESULT 3 
AI446293/C 
LOCUS       AI446293 476 bp mRNA EST 13-APR-1999
DEFINITION  tj31f05.x1 NCI_CGAP_Pan1 Homo sapiens cDNA clone IMAGE:2143137 3'
            similar to contains element PTR7 repetitive element mRNA
            sequence.
ACCESSION   AI446293
VERSION     AI446293.1 GI:4294194
KEYWORDS    EST.
SOURCE      human.
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1 (bases 1 to 476)
  AUTHORS   NCI-CGAP http://www.ncbi.nlm.nih.gov/ncicgap.
  TITLE     National Cancer Institute, Cancer Genome Anatomy Project (CGAP),
            Tumor Gene Index
  JOURNAL   Unpublished (1997)
COMMENT     On Apr 1, 1998 this sequence version replaced gi: 3034498.
            Contact: Robert Strausberg, Ph.D.
            Tel: (301) 496-1550
            Email: Robert_Strausberg@nih.gov
            Life Technologies catalog #: 11548-013
            DNA Sequencing by: Washington University Genome Sequencing Center
            Clone distribution: NCI-CGAP clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            www-bio.llnl.gov/bbrp/image/image.html
            Insert Length: 1090 Std Error: 0.00
            Seq primer: -40UP from Gibco
            High quality sequence stop: 433.
FEATURES             Location/Qualifiers
     source          1..476
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /clone="IMAGE:2143137"
                     /clone_lib="NCI_CGAP_Pan1"
                     /tissue_type="adenocarcinoma"
                     /lab_host="DH10B"
                     /note="Organ: pancreas; Vector: pCMV-SPORT6; Site_1: SalI;
                     Site_2: NotI; Cloned unidirectionally. Primer: Oligo dT.
                     Average insert size 1.72 kb. Life Technologies catalog #:
                     11548-013"
BASE COUNT  157 a 79 c 82 g 158 t
ORIGIN



Query Match 70.3%; Score 72.4; DB 39; Length 476; 

Best Local Similarity 82.8%; Pred. No. 7e-10; 

Matches 82; Conservative 0; Mismatches 17; Indels 0; Gaps 0;

Qy   4 ttctccaagctactcagaagactgaagcagaaggatcacttgaggccaggagttcaagat 63
       | | || ||| |||||| || |||| |||  |||||| |||||| |||||||||||||
Db 304 TACCCCCAGCCACTCAGGAGGCTGATGCAAGAGGATCGCTTGAGCCCAGGAGTTCAAGTC 245

Qy  64 cagcctgagcaacatagngaaaccctatctctaaaaata 102
       |||||| |||||||||| ||||||| ||||| |||||||
Db 244 CAGCCTAAGCAACATAGTGAAACCCCATCTCCAAAAATA 206



RESULT 7 
AA149238/C 
LOCUS       AA149238 373 bp mRNA EST 10-DEC-1996
DEFINITION  zo38h12.s1 Stratagene endothelial cell 937223 Homo sapiens cDNA
            clone IMAGE:589223 3' similar to contains Alu repetitive
            element; contains element PTR7 repetitive element mRNA sequence.
ACCESSION   AA149238
VERSION     AA149238.1 GI:1719832
KEYWORDS    EST.
SOURCE      human.
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1 (bases 1 to 373)
  AUTHORS   Hillier,L., Lennon,G., Becker,M., Bonaldo,M.F., Chiapelli,B.,
            Chissoe,S., Dietrich,N., DuBuque,T., Favello,A., Gish,W.,
            Hawkins,M., Hultman,M., Kucaba,T., Lacy,M., Le,M., Le,N.,
            Mardis,E., Moore,B., Morris,M., Parsons,J., Prange,C., Rifkin,L.,
            Rohlfing,T., Schellenberg,K., Soares,M.B., Tan,F., Thierry-Meg,J.,
            Trevaskis,E., Underwood,K., Wohldmann,P., Waterston,R., Wilson,R.
            and Marra,M.
  TITLE     Generation and analysis of 280,000 human expressed sequence tags
  JOURNAL   Genome Res. 6 (9), 807-828 (1996)
  MEDLINE   97044478
COMMENT     Contact: Wilson RK
            Washington University School of Medicine
            4444 Forest Park Parkway, Box 8501, St. Louis, MO 63108
            Tel: 314 286 1800
            Fax: 314 286 1810
            Email: est@watson.wustl.edu
            This clone is available royalty-free through LLNL; contact the
            IMAGE Consortium (info@image.llnl.gov) for further information.
            Seq primer: -40M13 fwd. from Amersham
            High quality sequence stop: 337.
FEATURES             Location/Qualifiers
     source          1..373
                     /organism="Homo sapiens"
                     /db_xref="GDB:4626963"
                     /db_xref="taxon:9606"
                     /clone="IMAGE:589223"
                     /clone_lib="Stratagene endothelial cell 937223"
                     /dev_stage="umbilical vein, 1 passage"
                     /lab_host="SOLR (kanamycin resistant)"
                     /note="Vector: pBluescript SK-; Site_1: EcoRI; Site_2:
                     XhoI; Cloned unidirectionally. Primer: Oligo dT.
                     Umbilical vein endothelial cells, passaged once. Average
                     insert size: 1.0 kb; Uni-ZAP XR Vector; --5' adaptor
                     sequence: 5' GAATTCGGCACGAG 3' -3' adaptor sequence: 5'
                     CTCGAGTTTTTTTTTTTTTTTTTT 3'"
BASE COUNT  95 a 69 c 72 g 134 t 3 others
ORIGIN



Query Match 68.7%; Score 70.8; DB 22; Length 373; 

Best Local Similarity 81.8%; Pred. No. 1.9e-09;

Matches 81; Conservative 0; Mismatches 18; Indels 0; Gaps 

Qy   4 ttctccaagctactcagaagactgaagcagaaggatcacttgaggccaggagttcaagat 63
       | | || ||| |||||| || |||| |||  |||||| |||||| ||||||||| |||
Db 125 TACCCCCAGCCACTCAGGAGGCTGATGCAAGAGGATCGCTTGAGCCCAGGAGTTGAAGTC 66

Qy  64 cagcctgagcaacatagngaaaccctatctctaaaaata 102
       |||||| |||||||||| ||||||| ||||| |||||||
Db  65 CAGCCTAAGCAACATAGTGAAACCCCATCTCCAAAAATA 27



Query= SEQ ID NO: 16 

(638 letters) 



Score E 

Sequences producing significant alignments: (bits) Value 

AL021391 ACCESSION:AL021391 NID: gi 4467344 emb AL021391.2 HS10... 347 2e-92

>AL021391 ACCESSION:AL021391 NID: gi 4467344 emb AL021391.2 HS102D24
Human DNA sequence from clone RP1-102D24 on chromosome 22
Contains a novel Mitosis-specific Chromosome Segregation
protein SMC1 LIKE protein gene, a novel unknown gene, and
the first coding exon of the FBLN1 gene for Fibulin 1.
Contains ESTs, STSs, GSSs and putative CpG islands,
complete sequence
Length = 138129 

Score = 347 bits (175), Expect = 2e-92
Identities = 175/175 (100%) 
Strand = Plus / Minus 

Query: 395   aggaggtggacagtgaacacagaaaagctgtaaggtgtcctgtgacagatgtatgtggtg 454
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 90259 aggaggtggacagtgaacacagaaaagctgtaaggtgtcctgtgacagatgtatgtggtg 90200

Query: 455   gacacagcaggacccagaggaaggaagaaagaagctgctcttgaaaagaccctcaaacca 514
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 90199 gacacagcaggacccagaggaaggaagaaagaagctgctcttgaaaagaccctcaaacca 90140

Query: 515   cgatgctcaaggaagtgtcgagagatgaaggagaggtgtttgccaggcagagcag 569
             |||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 90139 cgatgctcaaggaagtgtcgagagatgaaggagaggtgtttgccaggcagagcag 90085



Score = 248 bits (125), Expect = 1e-62
Identities = 127/128 (99%) 
Strand = Plus / Minus 

Query: 270   ggcctctgcgagactgtttcatagatgctcaagacaccagcaaaccagngccaccgaaca 329
             |||||||||||||||||||||||||||||||||||||||||||||||| |||||||||||
Sbjct: 90631 ggcctctgcgagactgtttcatagatgctcaagacaccagcaaaccagtgccaccgaaca 90572

Query: 330   agtatgagaaaagaacaggctagattatgttatccagaacttcacaaccatcagatctag 389
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 90571 agtatgagaaaagaacaggctagattatgttatccagaacttcacaaccatcagatctag 90512

Query: 390   acagaagg 397
             ||||||||
Sbjct: 90511 acagaagg 90504

Score = 111 bits (56), Expect = 2e-21
Identities = 64/67 (95%) 
Strand = Plus / Minus 

Query: 568   agtagagacaagttttcgccatgttggtcaagctggtctcaaacttctaacctnacgtaa 627
             |||||||||||||||||||||||||||||| |||||||||||||| ||||||| ||||||
Sbjct: 89080 agtagagacaagttttcgccatgttggtcaggctggtctcaaactcctaacctcacgtaa 89021

Query: 628   tccaccc 634
             |||||||
Sbjct: 89020 tccaccc 89014

Score = 75.8 bits (38), Expect = 1e-10
Identities = 46/50 (92%) 
Strand = Plus / Minus 

Query: 219   ccaggttnnagtgattcccgtgcttcngnctcctgagaagctgggattac 268
             |||||||  ||||||||||||||||| | |||||||||||||||||||||
Sbjct: 94134 ccaggttcaagtgattcccgtgcttcagcctcctgagaagctgggattac 94085



Query= SEQ ID NO: 17

(403 letters) 



Score E 

Sequences producing significant alignments: (bits) Value 

AC015933.9.1.249021 668 0.0

>AC015933.9.1.249021

Length = 249021 

Score = 668 bits (337), Expect = 0.0 
Identities = 383/402 (95%), Gaps = 3/402 (0%) 
Strand = Plus / Plus 



Query: 3      aaagagaaaaacaacattcaacancaacancaatttcccgaggatccctgcccacattca 62
              ||||||||||||||||  ||||| ||||| ||||||||||||||||||||||||||||||
Sbjct: 224797 aaagagaaaaacaacaa-caacaacaacaacaatttcccgaggatccctgcccacattca 224855

Query: 63     nagt-gncacatttacctacttnanaggggagatnaaagccncactctaaggctccttat 121
               ||| | || |||||||||||| | || |||||| |||||| ||||||||||||||||||
Sbjct: 224856 gagtag-cagatttacctacttcaaagtggagatcaaagccacactctaaggctccttat 224914

Query: 122    ttccacaggctggnaagcaaacanggcntacaggctttgcangagtgtatcctaattctc 181
              ||||||||||||| ||||||||| ||| ||||||||||||| ||||||||||||||||||
Sbjct: 224915 ttccacaggctggcaagcaaacaaggcatacaggctttgcaagagtgtatcctaattctc 224974

Query: 182    ttactgaagaaaagtcaacagcagagacancacagaaaaaggaatcaaagaggccaaatc 241
              ||||||||||||||||||||||||||||| ||||||||||||||||||||||||||||||
Sbjct: 224975 ttactgaagaaaagtcaacagcagagacaacacagaaaaaggaatcaaagaggccaaatc 225034

Query: 242    tgnggactcaaaacaataagaaaaaataaatcaactttgctaaaatttaagaatgccagg 301
              || |||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 225035 tgtggactcaaaacaataagaaaaaataaatcaactttgctaaaatttaagaatgccagg 225094

Query: 302    ggggtaggtaaatgcactgggaagtatgtgtggactatgatgataataaatctcctttca 361
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 225095 ggggtaggtaaatgcactgggaagtatgtgtggactatgatgataataaatctcctttca 225154

Query: 362    atacaactgatatttatcagaccttgaataaaacactgaatg 403
              ||||||||||||||||||||||||||||||||||||||||||
Sbjct: 225155 atacaactgatatttatcagaccttgaataaaacactgaatg 225196



Query= SEQ ID NO: 18

(103 letters) 



Score E 

Sequences producing significant alignments: (bits) Value 

AL360270 ACCESSION:AL360270 NID: gi 11121069 emb AL360270.18 H... 198 1e-48

>AL360270 ACCESSION:AL360270 NID: gi 11121069 emb AL360270.18 Human DNA
sequence from clone RP11-96K19 on chromosome 1, complete 
sequence 
Length = 172805 

Score = 198 bits (100), Expect = 1e-48
Identities = 102/103 (99%) 
Strand = Plus / Plus 

Query: 1     actttctccaagctactcagaagactgaagcagaaggatcacttgaggccaggagttcaa 60
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 93618 actttctccaagctactcagaagactgaagcagaaggatcacttgaggccaggagttcaa 93677

Query: 61    gatcagcctgagcaacatagngaaaccctatctctaaaaatac 103
             |||||||||||||||||||| ||||||||||||||||||||||
Sbjct: 93678 gatcagcctgagcaacatagtgaaaccctatctctaaaaatac 93720
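
The "Identities" figure reported for an alignment such as the one above can be recomputed from the two aligned strings. The following is a minimal sketch, not the search program's own code: the helper names `match_line` and `identities` are illustrative assumptions, and it assumes that an ambiguous base `n` and a gap character `-` are both counted as non-matching, which is consistent with the counts shown in this report.

```python
def match_line(query: str, subject: str) -> str:
    """Return '|' for each identical aligned base and ' ' otherwise.

    Assumption: 'n' only matches 'n', and a gap '-' never matches.
    """
    if len(query) != len(subject):
        raise ValueError("aligned strings must have equal length")
    return "".join(
        "|" if q == s and q != "-" else " "
        for q, s in zip(query.lower(), subject.lower())
    )


def identities(query: str, subject: str) -> tuple[int, int]:
    """Return (number of identical positions, alignment length)."""
    line = match_line(query, subject)
    return line.count("|"), len(line)


# Aligned strings copied from the SEQ ID NO: 18 alignment above.
query = (
    "actttctccaagctactcagaagactgaagcagaaggatcacttgaggccaggagttcaa"
    "gatcagcctgagcaacatagngaaaccctatctctaaaaatac"
)
subject = (
    "actttctccaagctactcagaagactgaagcagaaggatcacttgaggccaggagttcaa"
    "gatcagcctgagcaacatagtgaaaccctatctctaaaaatac"
)

matches, length = identities(query, subject)
print(f"Identities = {matches}/{length} ({100 * matches // length}%)")
```

For these two strings the sketch gives 102 identical positions out of 103, in agreement with the "Identities = 102/103 (99%)" line of the report; the single non-matching position is the ambiguous `n` in the query.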



